Making C Programs
Make
-
The make facility is an important tool that you will learn to use in this
course. It will be valuable in all of your computing science courses
-
make can automatically construct the executable form of a program
from a description of the source files and the dependencies between those
files
-
Once you have produced a Makefile for a program you don't need to remember
how to compile it, or even which files have changed, make handles all of
this automatically for you
-
The main input to make is a Makefile, which is basically a description of
how to build an executable system
-
The Makefile is divided into two main parts:
-
macro definitions
-
dependency declarations
Macro Definitions
-
Macros are used extensively in C programming, get used to them
-
A macro is a text replacement facility
-
A macro has a name, which is a character string, and a body, which is also
a character string, wherever the name occurs in the file it is replaced
by the body of the macro
-
In make we use macros to describe the compiler we are using, the compiler
options, and the list of files we need. It allows us to parameterize the
Makefile
-
A macro definition has the following format:
macro-name = macro-body
A macro definition can appear anywhere in a Makefile
In make, a macro is invoked by writing its name preceded by a $. If
the name is more than one character long it must be placed in parentheses
In our example Makefile we have
FILES = main.o phone.o
When we use $(FILES) in the rest of the Makefile it is replaced by the
two file names, as we add more files to the program we only need to change
the line where FILES is declared, the rest of the Makefile stays the same
- this saves time and reduces mistakes
Similarly we have the lines
CC = gcc
CFLAGS = -g
The first line defines the C compiler that we use and the second line defines
the compiler flags. Again we can quickly change the compiler options
without scanning through the entire Makefile
Dependency Declarations
-
The dependency declarations tell make which files depend on which other
files
-
In the case of our example, the file phone (the executable for our program)
depends on both main.o and phone.o, we say this in the following way:
phone: main.o phone.o
The files on the left of the : depend on the files listed on the right.
Here that means that the executable module 'phone' depends on the two object
modules, main.o and phone.o
Similarly, both main.o and phone.o depend on phone.h, there are two ways
that we could specify this
main.o : phone.h
phone.o : phone.h
or
main.o phone.o : phone.h
After each dependency declaration we can list the commands that construct
the dependent files. These commands must be indented: each command line
must begin with a tab character. WARNING: Traditional versions of make
(i.e. the version in your lab) do not accept spaces in place of the tab.
So in order to create phone, we use the following specification
phone : main.o phone.o
	gcc -o phone main.o phone.o
When we need to re-create phone, make will automatically execute the gcc
command
When is a dependency declaration used?
When make is started it looks at the time of last modification of all the
files mentioned in the Makefile, if the last time of modification of a
file on the right side of a dependency declaration is more recent than
one of the files on the left side then the commands are executed
In our example, if either main.o or phone.o is more recent than phone,
the gcc command will be executed, otherwise make knows that phone is up-to-date
and does nothing
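The timestamp rule can be sketched in C. This only illustrates the comparison; it is not how any real make is implemented. The function name out_of_date and its parameters are hypothetical, and the modification times are assumed to have been obtained elsewhere.

```c
#include <time.h>

/* Hypothetical helper: returns 1 if the commands for a target should be
 * run, 0 if the target is up to date.  target_mtime is the target's time
 * of last modification; dep_mtimes holds the times of the ndeps files on
 * the right side of the dependency declaration. */
int out_of_date(time_t target_mtime, const time_t *dep_mtimes, int ndeps)
{
    int i;

    for (i = 0; i < ndeps; i++) {
        if (dep_mtimes[i] > target_mtime)
            return 1;   /* a dependency is more recent than the target */
    }
    return 0;           /* nothing on the right side is newer */
}
```

With phone last modified at time 150 and its two object files at times 100 and 200, the function returns 1, so the gcc command would be run.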
How does make know how to make main.o and phone.o when there are no commands
associated with their dependencies?
Make has a set of default rules that know about Unix's file naming conventions.
A suffix indicates the type of a file; it starts with a dot ( . ) and
has one or more letters. The common ones are:
-
.c a C program
-
.h a header file
-
.o an object file
-
.s an assembler file
Make uses the file suffixes to determine how to make a file, if make needs
a .o file and can find a .c file with the same prefix, it knows that it
can use the C compiler to produce the .o file
For example, if name.o is required (it is mentioned on the right side of a
dependency rule) but there is no rule for name.o, then make will look for a
file named name.c and, if it finds one, will run the C compiler on it. The use
of default rules saves on the amount of typing you must do, BUT DON'T RELY ON
THEM IN C201
You can construct your own default rules
If make is started without arguments, it will use the first dependency
declaration as its target, that is it will make the file on the left side
of the first dependency rule in the Makefile
In our example Makefile, phone is the first target mentioned, so by default
make will build phone
You can specify the target when you call make, if we enter the command
make main.o
then make will only run the C compiler on main.c to produce main.o; it
will not produce a new version of phone
The most common file that you produce should be in the first dependency
declaration, so you don't need to mention it each time you use make
Besides making the executable of a program there are other standard things
that are placed in a Makefile, some of these operations include cleaning
up temporary files, installing the executable in a standard place, running
standard test cases, and printing the program
We could add the following rules to our Makefile:
clean: $(FILES)
	rm $(FILES)
install: phone
	cp phone $(HOME)/bin/phone
When we execute the command
make clean
All the .o files for the phone program will be deleted, note that a file
called clean will not be produced, therefore, each time we execute this
command the .o files will be deleted - make doesn't check whether it has
actually created the file that was on the left side of the rule
C Control Statements
-
There are a number of control structures in C, we will briefly list them
here:
{ statements }
Conditional Statement
if ( expression )
statement
or
if ( expression )
statement
else
statement
or
if ( expression )
{ statements }
else
{ statements }
or
if ( expression ) {
statements }
else {
statements }
While statement
while ( expression )
statement
or
while ( expression )
{ statements }
Do-while or Do-until statement
do
statement
while ( expression ) ;
The iteration continues while the expression evaluates to a non-zero value,
a zero value ends the iterating
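A small sketch of a case where the body must run at least once, which is what the do statement provides. The function name count_digits is made up for this illustration.

```c
/* Counts the decimal digits of n.  The body must run at least once so
 * that 0 is reported as having one digit. */
int count_digits(unsigned int n)
{
    int digits = 0;

    do {
        digits++;
        n /= 10;
    } while (n != 0);   /* a zero value ends the iterating */

    return digits;
}
```

Here count_digits(0) is 1 and count_digits(12345) is 5.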
For statement
for ( expression1 ; expression2 ; expression3 )
statement
All three expressions are optional
-
expression1 is evaluated once at the start of the loop
-
expression2 is evaluated at the start of each iteration, if the value of
this expression is 0 the loop is exited
-
expression3 is executed at the end of each iteration
The for statement is equivalent to:
expression1;
while ( expression2 ) {
statement
expression3;
}
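The equivalence can be checked with a pair of functions that compute the same sum, one with each kind of loop. The names sum_for and sum_while are invented for this sketch.

```c
/* Sum of 1..n written with a for statement. */
int sum_for(int n)
{
    int i, total = 0;

    for (i = 1; i <= n; i++)
        total += i;
    return total;
}

/* The same loop rewritten as a while statement. */
int sum_while(int n)
{
    int i, total = 0;

    i = 1;                  /* expression1 */
    while (i <= n) {        /* expression2 */
        total += i;         /* statement   */
        i++;                /* expression3 */
    }
    return total;
}
```

Both functions return 55 for n equal to 10.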
Switch statement
switch ( expression ) {
case constant-expression : statement
case constant-expression : statement
.
.
.
default : statement
}
The expression in the switch must evaluate to an integer value,
and all the case labels must also have integer values. When
the switch statement is entered the expression is evaluated and control
transfers to the matching case (like a goto statement). If there is no
match, the [optional] default case is used. Control flows from one case
to the next. The switch statement is NOT automatically exited
when the next case is reached
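The flow of control from one case into the next can be seen in the following sketch, which counts how many case bodies are executed. The function name steps_from is hypothetical, and some compilers will warn about the deliberate fall through.

```c
/* Returns the number of case bodies executed when the switch is
 * entered at the case matching start. */
int steps_from(int start)
{
    int steps = 0;

    switch (start) {
    case 1: steps++;    /* control flows on into case 2 */
    case 2: steps++;    /* control flows on into case 3 */
    case 3: steps++;
    default: ;          /* no match: nothing is counted */
    }
    return steps;
}
```

Entering at case 1 executes all three bodies, so steps_from(1) is 3, while steps_from(3) is 1 and steps_from(7) is 0.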
Break
break ;
The break statement causes the termination of the smallest enclosing while,
do, for or switch statement. This statement is commonly used to separate
the cases in a switch statement
Continue
continue ;
Causes control to transfer to the loop continuation (end of the loop) portion
of the smallest enclosing while, do or for statement. This statement is
used to transfer control to the next iteration of the loop, that is, to terminate
the current iteration and start on the next iteration
Example
-
In the example we use the following strategy to read from a file or terminal
read first line
while ( not end ) {
process the line
read next line
}
The main problem with this approach is the need for two read statements,
a better schema (plan) uses the break statement
while ( true ) {
read next line
if ( end )
break;
process the line
}
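The second schema might look like this in C. This is a sketch only: the function name total_chars and the buffer size are assumptions, and "processing" a line is reduced to adding up its length.

```c
#include <stdio.h>
#include <string.h>

/* Reads from in until the end of input and returns the total number of
 * characters seen.  There is only one read statement in the loop. */
size_t total_chars(FILE *in)
{
    char line[256];
    size_t total = 0;

    while (1) {
        if (fgets(line, sizeof line, in) == NULL)   /* read next line */
            break;                                  /* end of input   */
        total += strlen(line);                      /* process it     */
    }
    return total;
}
```

Calling total_chars(stdin) reads from the terminal; a line longer than the buffer arrives in several pieces, which does not affect the total.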
Basic C Types
-
C has a small number of basic data types, which are:
-
char a character
-
int an integer
-
float floating point number
-
double floating point number
-
I also consider pointers to be a basic data type.
-
A pointer is always the size of a machine address and is treated like an
unsigned integer
-
There are three modifiers that can be applied to most of the basic data
types
-
The long modifier asks for a representation with more bits. For example, on
most machines a float is 32 bits and a double is 64 bits, so in effect a
double is a long float
-
The long modifier is often used with the int type. A long int has at least
as many bits as an int. On many machines an int and a long int are the same
data type, but this is not always the case
-
The short modifier is used to specify that the minimum number of bits should
be used. It is usually used with ints - a short int is 16 bits on most
machines
-
The unsigned modifier can be used with character and integer data types.
An unsigned value doesn't interpret the sign bit as a sign; that bit is used
as part of the value. In other words unsigned values are never negative
and use all the bits in the value - this is used in bit manipulation operations
-
A variable declaration has the following format:
type variable_name = initial_value ;
-
The initial value part of the declaration is optional
Several variables can be declared at the same time, the variable names
are separated by commas
A pointer is declared in the following way
type* variable_name ;
The type specifies the type of the value that is being pointed to
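The declaration forms and modifiers can be tried out in a short function. This is a sketch; the name ranges_nest is invented, and the limits come from the standard header limits.h.

```c
#include <limits.h>

/* Returns 1 if the ranges of the integer types nest as the modifiers
 * suggest: short int within int, and int within long int. */
int ranges_nest(void)
{
    short int s = SHRT_MAX;         /* largest short int                 */
    int i = INT_MAX, j = 0;         /* two variables in one declaration  */
    long int l = LONG_MAX;          /* largest long int                  */
    unsigned int u = UINT_MAX;      /* every bit is part of the value    */
    int *pi = &j;                   /* pointer to an int                 */

    *pi = 1;                        /* stores 1 in j through the pointer */
    return s <= i && i <= l && u >= (unsigned int)i && j == 1;
}
```

The function returns 1 on any conforming compiler, whatever the actual sizes of the types are.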
Constants
-
Character constants are enclosed in single quotes. The \ is used as an
escape character - for example '\045' is the character with octal value
45 (i.e. a % in ASCII), while '\n' is the end of line character, '\012'.
-
An integer constant is a number without a decimal point, a long integer
constant is an integer constant with L as a suffix - for example 123L is
a long integer constant
-
An integer that starts with the digit 0 is a non-decimal constant.
If the 0 is followed by another digit it is an octal constant - for example
037 is an octal integer constant
-
If the 0 is followed by an x or X, the constant is a hexadecimal integer
constant - for example 0x1f is a hex constant (its decimal value is 1*16^1
+ 15*16^0 = 31)
-
Double or floating point constants are numbers that have decimal points
or exponents
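The different spellings of integer constants can be compared directly. A minimal sketch, assuming an ASCII system for the character comparison; the function name constants_agree is made up.

```c
/* Returns 1 if the different spellings of the same constants agree. */
int constants_agree(void)
{
    int  octal = 037;       /* leading 0: octal, 3*8 + 7 = 31      */
    int  hex   = 0x1f;      /* leading 0x: hex, 1*16 + 15 = 31     */
    long big   = 123L;      /* L suffix: long integer constant     */

    return octal == 31 && hex == 31 && big == 123
        && '\n' == '\012';  /* holds on ASCII systems (assumption) */
}
```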
Arrays
-
Arrays and pointers are very closely related in C
-
C basically supports one dimensional arrays, the first subscript value
is always 0
-
An array is declared in the following way:
type array_variable [ size ] ;
The type is the type of the individual array elements and size is the length
of the array
For example
int a[1000];
will produce an array that has 1000 integers in it, the first element is
a[0] and the last element is a[999]
Arrays are initialized in the same way as basic data types, except a list
of values must be specified
For example
int a[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
A text string is a one dimensional array of char elements
For example, a text string can be declared in the following way:
char string[25];
A character string constant is enclosed in double quotes ("), for example
"this is a text string" - there is a major difference between single and
double quotes
In C and Unix a text string is terminated by a zero byte, this is written
as '\0'. Thus an empty string requires one byte of storage. In general
a string with n characters requires n+1 bytes of storage - be careful to
allocate the extra byte
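The extra byte matters most when storage for a string is allocated at run time. The following sketch copies a string into newly allocated storage; the function name copy_string is an assumption made for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a freshly allocated copy of s, or NULL if malloc fails.
 * The caller is responsible for calling free() on the result. */
char *copy_string(const char *s)
{
    size_t n = strlen(s);           /* n characters, terminator not counted */
    char *copy = malloc(n + 1);     /* n + 1 bytes: room for the '\0'       */

    if (copy != NULL)
        memcpy(copy, s, n + 1);     /* copies the terminating zero byte too */
    return copy;
}
```

Allocating only n bytes would leave no room for the terminating zero byte, and the copy would write past the end of the allocated storage.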
Arrays and Pointers
-
An array name used in an expression acts as a pointer to the first element
of the array
-
For example:
int a[100];
int* pa;
pa = a;
Both pa and a point to the first element of the array of 100 integers,
both of the following reference the same array element:
a[i] and *(pa+i)
The expression pa+i, takes the value of the pointer pa and adds i elements
to it, thus pa+i points to the i'th element of the array a, the * operator
treats its operand as a pointer and retrieves the value at that address,
thus *(pa+i) first computes the address of element i, and then retrieves
its value
The & operator is used to compute the address of a variable, for example
pa = &a[5];
stores the address of element a[5] in pa. Thus the expression
*(pa+2)
retrieves the value of a[7]
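The pointer arithmetic can be wrapped in a function to try it out. The name element_at is hypothetical.

```c
/* Returns the element i places beyond the one that pa points to. */
int element_at(const int *pa, int i)
{
    return *(pa + i);       /* identical in meaning to pa[i] */
}
```

If a is an array of 10 integers, element_at(a, 3) is a[3] and element_at(&a[5], 2) is a[7].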
There is an important distinction between array and pointer declarations,
an array declaration allocates storage for the array while a pointer declaration
only allocates enough storage to store the pointer itself, no storage for
the value pointed to is allocated
We can of course have an array of pointers, it is declared in the following
way
int* pa[10];
This produces an array containing 10 pointers to integers
Multi-Dimensional Arrays
-
A multi-dimensional array is just an array of arrays
-
Thus we can declare a 2 dimensional array in the following way:
int a[10][20];
This is in fact 10 arrays, each of which contains 20 integer elements
An individual element of this array can be referenced in the following
way
a[i][j]
Of course we can still do things like the following
int a[10][20];
int* pa;
pa = &a[0][0];
x = *(pa+i*20+j);
Similarly we can do things like:
int a[10][20];
int (*pa)[20];
pa = &a[0];
Then (*pa)[5] would be the same as a[0][5]
An observation, a variable is declared in the way that it is used
The type in a variable declaration is usually a basic type, the declaration
syntax shows how a value of that basic type can be obtained from the variable
name
Also note that [] has higher precedence than *, so we need to use parentheses
in the above declaration, otherwise we would have an array of 20 pointers
to integers instead of a pointer to an array of 20 integers
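A pointer to an array of 20 integers is also the type a two dimensional array has when it is passed to a procedure. A minimal sketch; the name element_2d and the macro COLS are made up for this example.

```c
#define COLS 20

/* rows points to the first row of a two dimensional array whose rows
 * each hold COLS ints; returns the element in row i, column j. */
int element_2d(int (*rows)[COLS], int i, int j)
{
    return (*(rows + i))[j];    /* same element as rows[i][j] */
}
```

With int a[10][20], the call element_2d(a, 3, 4) returns a[3][4].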
Character Arrays and Initialization
-
An array of pointers to text strings (which is not the same as a 2 dimensional
array of characters) is often used in C programming, such a structure can
be initialized in the following way:
char* name[] = {
"fred",
"george",
"paul",
"mark",
NULL
};
The NULL on the last line of the initialization isn't required by C; it serves
as a marker for the end of the array. We can detect the end of the array in
the following way:
for(i=0; i<1000; i++) {
if(name[i] == NULL)
break;
}
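The same test can drive the loop directly, with no arbitrary upper limit. This is a sketch, and the function name count_names is an assumption.

```c
#include <stddef.h>

/* Counts the entries before the NULL marker in an array of strings.
 * The array MUST end with NULL, otherwise the loop runs off the end. */
int count_names(const char *names[])
{
    int i;

    for (i = 0; names[i] != NULL; i++)
        ;                       /* empty body: the test does the work */
    return i;
}
```

For the four names above the function returns 4.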
Structures
-
C structures are similar to records in Pascal, they allow us to collect
together several pieces of related data into one data structure, the individual
pieces of data are called structure elements or structure members
-
A structure is declared in the following way:
struct structure_name {
member declaration;
member declaration;
.
.
.
member declaration;
};
This declaration doesn't allocate any memory, it just provides a template
for the structure, it declares the values that can be stored together
The individual structure elements can be of any C type, including other
structures
For example, we could have the following for the declaration of a name
structure
struct name {
char* first;
char* last;
};
The name structure has two elements, the character pointers first and last,
variables can have the same names as structure elements and the same element
name can be used in different structure declarations
There are several ways that we can declare a variable that has a structure
type
One way is to do the following:
struct structure_name variable_name;
So with our example name struct we could do the following:
struct name fred, george;
Another way of doing this is:
struct structure_name {
member declarations
} variable_name;
In this approach we combine the structure declaration with the variable
declaration, the structure name is optional, but it is always a good idea
to include it
A structure variable can also be initialized, this can be done in the following
way:
struct structure_name variable_name = { element values } ;
So for our example we could have:
struct name george = { "george", "brown" };
The element values are assigned in the same order as the element declarations
The . (dot) operator is used to extract the individual elements from a
structure value, in the case of our name structure we can do the following:
george.first
george.last
In general the syntax is
variable_name . element_name
A slightly different syntax is used for pointers to structures, for example
struct name* person;
The variable person is now a pointer to a name structure, knowing what
we know about pointers we can use the following to get the value of the
first element of the structure that person points to
(*person).first
or
person->first
The -> operator takes a pointer to a struct, follows the pointer to the
structure value and then extracts the field - this is a shorthand, but
it makes sense
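The two notations can be compared in a short sketch. The function names same_access and full_length are invented; the structure is the name structure from these notes.

```c
#include <string.h>

struct name {
    char *first;
    char *last;
};

/* Returns 1 if both ways of reaching a member through a pointer agree. */
int same_access(struct name *person)
{
    return (*person).first == person->first
        && (*person).last  == person->last;
}

/* Length of the full name "first last", including the space. */
size_t full_length(const struct name *person)
{
    return strlen(person->first) + 1 + strlen(person->last);
}
```

For a structure holding "george" and "brown", full_length returns 12.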
We can have arrays of structures, which is often quite convenient, this
is done in the following way:
struct structure_name variable_name [ size ];
In the case of our name structure, we could have an initialized array of
names constructed in the following way:
struct name persons[] = {
{ "george", "brown" },
{ "fred", "black" },
.
.
.
{ NULL, NULL }
};
Again we use explicit NULL pointers at the end of the array to indicate
there are no more elements. There are other ways of doing this, but
this is the safest
We can include pointers to a structure within the declaration of the structure,
we use this technique to build linked lists and binary trees
We can use the following structure declaration for a node in a binary tree
struct node {
int value;
struct node* left;
struct node* right;
};
Note that left and right must be pointers to structures, they cannot be
structure variables, otherwise we will have a structure that includes two
copies of itself
The same thing can be done for a linked list:
struct list_node {
float value;
struct list_node* next;
};
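A procedure that walks such a list follows the next pointers until it reaches NULL. A minimal sketch; the function name list_sum is an assumption, as is the convention that the last node has a NULL next pointer.

```c
#include <stddef.h>

struct list_node {
    float value;
    struct list_node *next;
};

/* Adds up the values in a list; the last node has next == NULL. */
float list_sum(const struct list_node *head)
{
    float total = 0.0f;

    while (head != NULL) {
        total += head->value;
        head = head->next;      /* follow the pointer to the next node */
    }
    return total;
}
```

An empty list is represented by a NULL pointer, for which the sum is 0.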
Fields
-
The structure elements that we have seen so far have been standard C data
types, these data types may not be the most efficient way of storing data
in a structure
-
Fields allow us to pack data into a structure as densely as possible, a
field is an integer value (either signed or unsigned) where the programmer
specifies the number of bits occupied by the field value
-
We can define fields in the following way:
struct example {
int field_a : 4;
int field_b : 6;
unsigned field_c : 6;
};
In this structure all three fields (4 + 6 + 6 = 16 bits) can be packed into a
16 bit word. The first two fields are signed on most compilers (write signed
int to be certain) and the last one is unsigned
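The limited width of a field can be observed by storing a value that is too large for it. This is a sketch: the function name store_in_c is made up, and the first two fields are declared signed int here so that their signedness does not depend on the compiler.

```c
struct example {
    signed int field_a : 4;     /* 4 bit signed field              */
    signed int field_b : 6;     /* 6 bit signed field              */
    unsigned   field_c : 6;     /* 6 bit unsigned field: 0..63     */
};

/* Stores v in the 6 bit unsigned field and returns what was kept. */
unsigned int store_in_c(unsigned int v)
{
    struct example e = { 0, 0, 0 };

    e.field_c = v;              /* only the low 6 bits are kept */
    return e.field_c;
}
```

So store_in_c(63) is 63, but store_in_c(64) is 0 and store_in_c(70) is 6.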
Lvalues and Rvalues
-
Lvalues and Rvalues are important in understanding expressions in C
-
An Lvalue is anything that can be on the left side of an assignment operator,
in other words it represents a memory location where a value can be stored
- Lvalues include variables and pointer expressions
-
An Rvalue is anything that can be on the right side of an assignment operator,
in other words it represents an expression or value
-
All expressions are Rvalues, but only some of them can be used as Lvalues,
in other words any place that an Rvalue is required an Lvalue can be used,
but the opposite is not true
Assignment Expression
-
There are several forms of assignment expressions, note that assignment
is an expression, it has a value and can be used as part of a larger expression
-
The general format of an assignment expression is
Lvalue assignment_operator Rvalue
The standard assignment_operator is =, but there are several other useful
ones, such as:
+=
-=
*=
/=
These operators are interpreted in the following way:
x op= y
is the same as
x = (x op y)
Operators
-
C has the standard arithmetic operators:
+
-
*
/
% - modulus or remainder
The standard comparison operators are:
==
!=
<
<=
>
>=
Logical Operators
-
Recall, that in C zero is treated as false and non-zero is treated as true
-
The ! is the logical negation (not) operator: if the operand is non-zero the
result is zero and if the operand is zero the result is 1
-
There are two binary logical operators:
&& - logical and
|| - logical or
These operators don't always evaluate their second operand. In the
case of &&, if the first operand evaluates to zero, the second
operand is not evaluated since the result is already zero
Similarly for ||, if the first operand evaluates to a non-zero value the
second operand won't be evaluated since the result will be 1
This allows us to write expressions that would otherwise be doubtful, like:
if(n != 0 && m/n > 5) ...
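Wrapped in a function, the test looks like this. The name ratio_exceeds and the limit parameter are assumptions made for the sketch.

```c
/* Returns 1 if m/n is greater than limit, and 0 otherwise.
 * When n is zero the division is never attempted. */
int ratio_exceeds(int m, int n, int limit)
{
    return n != 0 && m / n > limit;
}
```

The call ratio_exceeds(10, 0, 5) returns 0 without dividing by zero, because the second operand of && is not evaluated.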
Increment and Decrement Operators
-
The ++ and -- operators are used to increment and decrement Lvalues. They
can be used as either a prefix or a postfix operator
-
If they are used as prefix operators the value of the expression is the
new value of the Lvalue, for example
m = ++n;
is the same as
n = n+1;
m = n;
If they are used as a postfix operator, the value of the expression is
the value of the Lvalue before the operation is performed, for example
m = n++;
is the same as
m = n;
n = n+1;
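The difference between the two forms can be observed by returning both m and n from a procedure. This sketch uses pointer parameters to hand back the two results; the function and parameter names are made up.

```c
/* m = ++n : n is incremented first, then its new value is used. */
void prefix_demo(int n, int *m_out, int *n_out)
{
    int m = ++n;

    *m_out = m;
    *n_out = n;
}

/* m = n++ : the old value of n is used, then n is incremented. */
void postfix_demo(int n, int *m_out, int *n_out)
{
    int m = n++;

    *m_out = m;
    *n_out = n;
}
```

Starting with n equal to 5, the prefix form gives m = 6 and n = 6, while the postfix form gives m = 5 and n = 6.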
Conditional Expression
-
A conditional expression has the following syntax:
expression1 ? expression2 : expression3
First expression1 is evaluated, if it is non-zero then expression2 is evaluated
and used as the value of the expression, in this case expression3 is ignored
If expression1 is zero then expression3 is evaluated and used as the value
of the expression, in this case expression2 is ignored
This operator can be used in the following way
x = n != 0 ? m/n : 0;
The conditional operator essentially allows the programmer to put an if
statement in the middle of an expression
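For comparison, here is the same computation as a function. The name safe_divide is hypothetical.

```c
/* Returns m/n, or 0 when n is zero. */
int safe_divide(int m, int n)
{
    return n != 0 ? m / n : 0;  /* m/n is not evaluated when n is 0 */
}
```

Written with an if statement this would need two return statements or a temporary variable.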
Procedures - Part 1
-
There are two ways of declaring and defining procedures in C, the old way
and the ANSI standard way - you will run into both so we will cover both
-
All procedures in C are really functions, that is, they can return a value.
The special type void is used to indicate that a procedure does not return
a value
-
C has both procedure declarations and procedure definitions - these are
two different concepts
-
A procedure declaration contains all the information required to call the
procedure, that is the name of the procedure, the types of its parameters
(optional) and the type of the return value
-
A procedure definition includes all the information in a procedure declaration,
plus local variables and the statements in the procedure - it not only
describes how the procedure can be called, but also how it computes its
value
Procedure Declarations
-
The old style of procedure declaration is:
type procedure_name();
The type is the type of the return value
The ANSI style of procedure declaration is:
type procedure_name(parameter_declarations);
The parameter declarations are separated by commas and each declaration
has the following format:
type parameter_name
This is the same format as variable declarations
The ANSI style procedure declarations should be used
Procedure Definitions
-
The old style of procedure definition is:
type procedure_name(parameters)
parameter declaration;
parameter declaration;
.
.
.
parameter declaration; {
variable declarations
statements
}
The parameters are a comma separated list of parameter names. The
parameter declarations need not be in the same order as the parameter names.
The advantage of this format is that there is a separate line for each parameter,
so they are easier to document
The ANSI style of procedure definition is:
type procedure_name(parameter_declarations) {
variable declarations
statements
}
In all cases the return statement is used to specify the value returned
by the procedure and return control to the calling procedure
The two formats of the return statement are:
return;
and
return(Rvalue);
The first form is only used with procedures of type void
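An ANSI style declaration, a definition and both forms of the return statement can be put together in one sketch. The procedure names larger and clamp_to_zero are invented for this example.

```c
/* Declaration (ANSI style): everything needed to call the procedure. */
int larger(int a, int b);

/* A procedure of type void returns no value, so it uses "return;". */
void clamp_to_zero(int *value)
{
    if (*value >= 0)
        return;             /* first form: nothing to return */
    *value = 0;
}

/* Definition: the declaration plus the statements. */
int larger(int a, int b)
{
    if (a > b)
        return(a);          /* second form of the return statement */
    return(b);
}
```

Because the declaration of larger appears before any use, other procedures in the file can call it even though its definition comes later.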
Parameter Passing
-
All parameter passing in C is by value, that is, when a procedure is called
the parameter values from the calling procedure are copied into temporary
storage in the called procedure - all modifications to the parameter values
occur in this temporary storage, the original values in the calling procedure
are not changed
-
This means that you cannot return a result directly through a parameter.
The indirect approach is to use a parameter that points to the variable
where the result should be stored
-
Remember that an array is passed as a pointer to its first element, so if
you pass an array (not an array element) to a procedure, you can modify the
elements in the array, and these modifications will be seen outside of the
procedure
void foo(int x) {
/* we can do anything we like to x inside this procedure, */
/* the calling procedure won't see any of these changes, */
/* the calling procedure only provides the initial value of x */
}
If the parameter is a pointer instead:
void foo(int* x) {
*x = 5;
}
foo(&i);
printf("%d",i);
This will print 5, since we have passed a pointer to i (that is, &i)
into the procedure. The value pointed at is changed within the procedure
- NOTE that foo(i) will cause all sorts of problems if i isn't a pointer
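The three cases (a value, a pointer and an array) can be placed side by side. A sketch with made-up procedure names:

```c
/* By value: only the local copy of x changes. */
void set_copy(int x)
{
    x = 5;
    (void)x;                /* the new value is never seen by the caller */
}

/* Through a pointer: the caller's variable is changed. */
void set_through_pointer(int *x)
{
    *x = 5;
}

/* An array argument arrives as a pointer to its first element,
 * so the caller sees the new element values. */
void fill(int a[], int n, int value)
{
    int i;

    for (i = 0; i < n; i++)
        a[i] = value;
}
```

If i is 1, then after set_copy(i) it is still 1, and after set_through_pointer(&i) it is 5.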